1 Introduction

Volumetric visualization has been an active research
topic since medical devices made it possible to acquire
volumetric data from inside the human body.

In 3D MRI acquisition sequences, the data is collected
in the form of an equidistant lattice, with a
scalar value associated with each lattice element.

The resolution of MRI devices has been constantly
increasing, and we have reached the point where the
typical number of voxels ($256^3$ or greater) is too high
for real-time brute-force visualization on commonly
available PC hardware.

The apparent usefulness of real-time volumetric visualization
has driven research in this field. Many
optimizations, approximations and advanced algorithms
have been developed and refined for real-time
operation. In the object-order category of algorithms
we have, e.g., shear-warp [1], splatting [2]
and octrees [3]. In the image-order category we have,
e.g., proximity clouds [4][5] and optimized
space-traversals [6].

In the following, we present a new image-order algorithm,
which can theoretically speed up algorithms
based on ray-traversal about 20-fold on average. This
algorithm does not need precomputed data structures,
allowing real-time modification of the data as well as
real-time visualization.

2 Methods 

In figure 1 we see the basic ray-casting scheme. The
rays are cast from a user-defined plane and stop
when they hit a voxel or when a maximum length is
reached. In discrete ray-casting these rays are travelled
in discrete steps. If the number of discrete
steps needed to travel ray i is $d_i$, and the total number
of rays is N, then the computational cost is proportional to

$$\sum_{i=1}^{N} d_i = D \cdot N,$$

where D is the average number of steps needed per
ray.

Figure 1: Basic ray-casting.

$D \cdot N$ is usually a very large number. E.g., with a $256^3$
volume, a $256^2$ window and 20 cycles per step, we have
a total computational cost of approximately

$$20 \cdot D \cdot N \approx 20 \cdot 128 \cdot 256^2 = 167\,772\,160 \text{ (cycles)}.$$

That is, even with a 1 GHz computer we get only about
6 frames per second.
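
The estimate above can be checked with a few lines of Python. This is only an illustrative sketch: the variable names are hypothetical, and the 1 GHz clock with one cycle per tick is an assumption made to reproduce the figure of about 6 frames per second.

```python
# Brute-force cost estimate for discrete ray-casting, using the values in the text.
cycles_per_step = 20        # cost of one discrete step, as stated in the text
avg_steps_per_ray = 128     # D: average number of steps per ray
num_rays = 256 ** 2         # N: one ray per pixel of a 256 x 256 window

cycles_per_frame = cycles_per_step * avg_steps_per_ray * num_rays

clock_hz = 1_000_000_000    # assumption: 1 GHz processor, one cycle per clock tick
frames_per_second = clock_hz / cycles_per_frame

print(cycles_per_frame)             # 167772160
print(round(frames_per_second, 2))  # 5.96
```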

Since we don’t want to reduce N (the number of pixels),
we reduce D with an iterative algorithm using three
steps. Figure 2a shows the first step. We cast only
every n-th ray in each direction of the window (compare
with fig. 1). We call these rays “probe rays.” There are
$N/n^2$ of them, so the computational cost of step 1 is
proportional to

$$D \cdot \frac{N}{n^2}.$$

In the second step (fig. 2b) we construct new planes 
between the probe rays. These new planes are parallel 
to the original plane and go through the voxel at the 
end of the probe ray which has the shortest length. 
The computational cost of this step can be considered 
negligible. 




Figure 2: Phases of the algorithm. a) Step 1: cast probe rays. b) Step 2: make planes. c) Step 3: cast remaining rays.

In the third step (fig. 2c) we start casting the remaining
$N - N/n^2$ rays from the constructed planes. The
computational cost of this step is proportional to

$$D_{new}(n) \cdot \left(N - \frac{N}{n^2}\right).$$

$D_{new}(n)$ is the average depth of the rays starting
from the constructed planes and should be considerably
smaller than D. $D_{new}(n)$ also depends on the
distance between the probe rays.
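
The three steps can be sketched in Python with NumPy. This is a minimal illustration under simplifying assumptions, not the authors' implementation: the volume is a boolean array, all rays run along the +z axis from the plane z = 0, and probe rays are cast every n-th pixel in both window directions. The function names (`cast_ray`, `render_brute_force`, `probe_positions`, `render_with_probes`) are hypothetical.

```python
import numpy as np

def cast_ray(volume, x, y, z_start=0):
    """Walk along +z from z_start until a filled voxel or the end of the volume.

    Returns (depth reached, number of discrete steps taken)."""
    z_max = volume.shape[2]
    z = z_start
    while z < z_max and not volume[x, y, z]:
        z += 1
    return z, z - z_start

def render_brute_force(volume):
    """Cast one ray per pixel from the original plane; returns (depth map, steps)."""
    w, h, _ = volume.shape
    depth = np.zeros((w, h), dtype=int)
    steps = 0
    for x in range(w):
        for y in range(h):
            depth[x, y], s = cast_ray(volume, x, y)
            steps += s
    return depth, steps

def probe_positions(size, n):
    """Every n-th pixel index, plus the last index so every block has four corners."""
    pos = list(range(0, size, n))
    if pos[-1] != size - 1:
        pos.append(size - 1)
    return pos

def render_with_probes(volume, n):
    """Probe-ray rendering (assumes a window of at least 2 x 2 pixels)."""
    w, h, _ = volume.shape
    depth = np.full((w, h), -1, dtype=int)   # -1 marks "not yet cast"
    steps = 0
    xs, ys = probe_positions(w, n), probe_positions(h, n)

    # Step 1: cast the probe rays from the original plane.
    for x in xs:
        for y in ys:
            depth[x, y], s = cast_ray(volume, x, y)
            steps += s

    for x0, x1 in zip(xs, xs[1:]):
        for y0, y1 in zip(ys, ys[1:]):
            # Step 2: the new plane lies at the depth of the shortest corner probe ray.
            start = int(min(depth[x0, y0], depth[x0, y1],
                            depth[x1, y0], depth[x1, y1]))
            # Step 3: cast the remaining rays of this block from the new plane.
            for x in range(x0, x1 + 1):
                for y in range(y0, y1 + 1):
                    if depth[x, y] < 0:
                        depth[x, y], s = cast_ray(volume, x, y, start)
                        steps += s
    return depth, steps

# Test volume: a surface tilted 45 degrees to the viewing plane (filled where z >= x + 5).
x_idx, _, z_idx = np.indices((33, 33, 64))
volume = z_idx >= x_idx + 5

ref_depth, ref_steps = render_brute_force(volume)
fast_depth, fast_steps = render_with_probes(volume, n=8)
print(np.array_equal(ref_depth, fast_depth), ref_steps, fast_steps)
```

On this test volume both renderers produce the same depth map, while the probe-based one takes far fewer steps. Note that in this sketch anything lying in front of a constructed plane between the probe rays is skipped.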

We approximate average-case performance by considering
the case where the voxel surface is at a 45-degree
angle to the original plane (figure 3). In this case,
$D_{new}(n) = n/2$. The total computational cost can then
be found by adding the costs of the steps:

$$D \cdot \frac{N}{n^2} + \frac{n}{2} \cdot \left(N - \frac{N}{n^2}\right) = N \cdot \left(\frac{D}{n^2} + \frac{n}{2}\left(1 - \frac{1}{n^2}\right)\right).$$

Figure 4 shows the latter factor with D = 128 (the
approximated average distance of a plane from a $256^3$
volume) as a function of n. The minimum comes with
$n \approx 8$, and the computational cost compared to the
brute-force algorithm is

$$\frac{1}{128}\left(\frac{128}{8^2} + \frac{8}{2}\left(1 - \frac{1}{8^2}\right)\right) \approx \frac{1}{21}.$$

Figure 5: Performance of our program.

A 21-fold acceleration is sufficient to give real-time
performance, since brute-force algorithms ($D \cdot N$)
perform at a couple of frames per second.
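
The location of the minimum and the size of the speed-up can be checked numerically. The sketch below assumes the cost model implied by the text: probe rays every n-th pixel in both window directions (so $N/n^2$ probe rays), $D_{new}(n) = n/2$ for the 45-degree case, and a negligible cost for step 2. The function name `cost_factor` is hypothetical.

```python
def cost_factor(n, D=128):
    """Per-pixel cost D/n^2 + (n/2) * (1 - 1/n^2) under the assumptions above."""
    return D / n**2 + (n / 2) * (1 - 1 / n**2)

D = 128
factors = {n: cost_factor(n, D) for n in range(1, 33)}

best_n = min(factors, key=factors.get)   # probe spacing with the lowest cost
speedup = D / factors[best_n]            # relative to brute force (n = 1)

print(best_n, factors[best_n], round(speedup, 1))  # 8 5.9375 21.6
```

With n = 1 the factor reduces to D, i.e. the brute-force cost, which is a quick sanity check on the model.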

3 Results 

We implemented this algorithm in software on an
Athlon 500 MHz system running Linux.
Figure 5 shows fps vs. n with a
256 x 258 x 170 MRI volume. From the figure we
see that the brute-force performance is 1 fps (n = 1).
The plot also shows a peak around n = 8, as predicted by
figure 4. Top performance is only 14 fps instead of
21 fps. This suggests that the cost of step 2 is not
actually negligible. Otherwise, the overall shape of
figure 5 is as predicted by figure 4.
4 Discussion

That the performance depends so strongly on the distance
between the probe rays was a big surprise to us.
One might think that casting every second ray would
be a good optimization, but figure 5 clearly shows that
this is not the case. With n = 2 we get only a 2- to
3-fold acceleration, compared to the massive 14-fold
acceleration with $n \approx 8$.

This algorithm can render large, continuous volumes
efficiently. MRI data falls into this category. Small
and discontinuous fMRI and MEG data can be rendered
better using back-projection methods. Combining
different kinds of data can be done after rendering
with transparency operations.



